System for optimizing server use in a data center

ABSTRACT

A system and method for analyzing the usage of servers and their associated applications and providing suggestions on how to best make use of the resources of the servers. Numerous forms of analysis are utilized to provide suggestions to a user for efficient use of the servers. Should a suggestion be accepted by a user, a conversion between servers is conducted. Optionally, a conversion may be conducted automatically based upon a suggestion.

COPYRIGHT NOTICE

A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.

BACKGROUND OF THE INVENTION

Computing systems are growing rapidly in size and complexity. Many businesses have data centers consisting of a multitude of servers. In such an environment servers will have different configurations of hardware and software, including operating systems.

One of the problems in managing a data center is moving Operating Systems, applications and data between servers, especially servers with different technology architectures, and load balancing across and within the architectures to provide optimal use of the servers. Generally, optimization occurs only within a common architecture.

Moving an Operating System, related applications and data from a source server to a target server traditionally requires that all software on the source server be reinstalled on the target server. This is often not trivial. The source server may have legacy applications that cannot be reinstalled. Further, the source server may be utilizing a version of an operating system that is not supported on the target server. The source server and target server may also differ in device drivers and connections to peripherals. Typically the individual performing the transfer must have direct contact with the source and target machines to insert media and enter commands.

Load balancing requires a user to determine which software applications should run on which servers and when. In a large data center this is a complex problem. Load balancing is a constantly moving target as both applications and server configurations change. The user must be aware of all applications, the amount of resources they require and when the applications use the resources of a server.

SUMMARY OF THE INVENTION

Some embodiments of the present invention are directed to a system for remotely monitoring usage of machines in a data center and suggesting conversions between machines to make efficient use of the resources in the data center, the system comprising a data collection engine and an optimization engine operatively coupled to the data collection engine.

Some embodiments of the present invention are directed to a method for remotely monitoring usage of machines in a data center to make efficient use of resources in the data center, the method comprising the steps of collecting performance and machine data, analyzing the data, and suggesting conversions between machines.

BRIEF DESCRIPTION OF THE DRAWINGS

For a better understanding of the present invention, and to show more clearly how it may be carried into effect, reference will now be made, by way of example, to the accompanying drawings which aid in understanding an embodiment of the present invention and in which:

FIG. 1 is a block diagram illustrating the conversions between machines;

FIG. 2 is a block diagram of a system utilizing an embodiment of the present invention;

FIG. 3 is a block diagram of a data center;

FIG. 4 is a block diagram illustrating the interactions between the PowerConvert and OFX modules;

FIG. 5 is a block diagram illustrating machine hierarchy;

FIGS. 6a and 6b are a flow chart of the functionality of PowerConvert;

FIG. 7 is a block diagram of the components of an OFX controller;

FIG. 8 is a block diagram of the components of PowerRecon;

FIG. 9 is a flow chart of the functionality of the analysis portion of PowerRecon; and

FIG. 10 is a block diagram of the components of PowerOptimize.

DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION

Referring first to FIG. 1, a block diagram illustrating the conversions between machines is shown. There are three types of machines, Physical 10, Virtual 12 and Image 14. Physical machines 10 are servers upon which an operating system and its related software applications run. Virtual machines 12 emulate a specific environment and run on virtual machine server software. For example, some virtual machines 12 may run on a version of Linux, others on versions of Windows. Through the use of virtual server software such as ESX and GSX provided by VMware Inc., and Microsoft Virtual Server (hereinafter referred to as MSVS), multiple virtual machines 12 may be deployed on a physical machine 10. Other virtual servers such as Xen, which is open source, Virtual Iron and SWsoft may also be supported by the embodiments of the present invention. Machine Images 14 are stored copies of the state of a physical machine 10 or a virtual machine 12 at a specific time.

The conversion from a physical machine 10 (P) to a virtual machine 12 (V) is referred to as P2V. Similarly the conversion from a virtual machine 12 (V) to a machine image 14 (I) is referred to as V2I. In general, X is used whenever the type of the source or target does not matter. For example, I2X represents a conversion from a machine image to any other type, physical, virtual or image. In total there are nine possible conversion types as shown in FIG. 1. The intent of FIG. 1 is to illustrate that a machine of any type may be converted to a machine of any other type utilizing the present invention.
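By way of illustration only, the naming scheme described above can be sketched in Python. The names MACHINE_TYPES, conversion_types and expand are hypothetical and are not part of PowerConvert; the sketch merely enumerates the nine source-to-target pairs and expands the wildcard X.

```python
from itertools import product

# P = Physical, V = Virtual, I = Image (hypothetical constant name)
MACHINE_TYPES = ("P", "V", "I")


def conversion_types():
    """Return every source-to-target conversion name, e.g. 'P2V'."""
    return [f"{src}2{dst}" for src, dst in product(MACHINE_TYPES, repeat=2)]


def expand(pattern):
    """Expand a pattern such as 'I2X', where X stands for any machine type."""
    src, dst = pattern.split("2")
    sources = MACHINE_TYPES if src == "X" else (src,)
    targets = MACHINE_TYPES if dst == "X" else (dst,)
    return [f"{s}2{t}" for s, t in product(sources, targets)]


print(len(conversion_types()))  # 9
print(expand("I2X"))            # ['I2P', 'I2V', 'I2I']
```

Under this reading, X2X expands to all nine conversion types.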

Referring next to FIG. 2, a block diagram of a system utilizing an embodiment of the present invention is shown generally as 20. Data center 22 is where the machines reside and is shown in greater detail in FIG. 3. In managing data center 22 a user may wish to move operating systems, applications and data between machines 10, 12, and 14 depending on the addition of new servers, load balancing and disaster recovery. For example, it may be determined that a virtual machine 12 would run more efficiently on a different physical machine 10. Machine images 14 allow for the state of a physical machine 10 or a virtual machine 12 to be backed up and restored as needed. A machine image 14 is stored on an image server, which serves as a host for the image. The only difference between a machine image 14 and a virtual machine 12 is that the machine image 14 may not be started. To start a machine image 14 it must be moved to either a physical machine 10 or a virtual machine 12.

The conversion between machines is directed by PowerConvert 24. PowerConvert 24 resides on a server and has a distinct URL. Through the use of a Graphical User Interface (GUI) 26, a user may manage the movement of operating systems, applications and data between machines 10, 12 and 14 residing in a network of machines shown as data center 22. PowerConvert 24 obtains information on machines within data center 22 as selected by the user through GUI 26 and allows the user to move operating systems, applications and data between machines. OFX 28 controls and reports on the jobs requested by PowerConvert 24 and client applications 30. OFX 28 resides on a server and has a distinct URL. In essence OFX 28 is a generic job management engine that remotely executes and monitors jobs through OFX controllers 44 (see FIG. 3). Applications can be created through the use of OFX functionality.

PowerRecon 32 accesses servers in data center 22 to monitor the servers and collects statistical data on real and virtual machine usage. PowerRecon 32 may be thought of as a data collection engine. PowerOptimize 34 utilizes the data gathered by PowerRecon 32 to suggest to a user options for optimizing the use of the servers in data center 22. PowerOptimize 34 may be thought of as an optimization engine. Should the user wish to have PowerOptimize 34 automatically act upon the options selected, PowerOptimize 34 instructs PowerConvert 24 to perform the optimizations.

Referring now to FIG. 3, a block diagram of a data center 22 is shown. Data center 22 is a repository of various types of machines in various numbers. In the example shown there is a physical machine 10, a virtual server 40 hosting a plurality of virtual machines 12 and a machine image server 42 hosting a plurality of machine images 14. A virtual server 40 is a computer running virtual server software such as MSVS, ESX, or GSX. Through the use of virtual server software, multiple virtual machines may exist. Machine image server 42 is a computer that controls the storage of a number of machine images 14. Generically, a machine container is either a virtual server 40 or a machine image server 42. Also, the term contained machine is used when making reference to either a virtual machine 12 or a machine image 14.

FIG. 4 is a block diagram illustrating the interactions between the PowerConvert and OFX modules.

PowerConvert 24 comprises four main components: PowerConvert Business Server 42, Database 54, PowerConvert Web Services Interface 56, and PowerConvert Controller 58. PowerConvert Business Server 42 handles requests to convert from a source machine to a target machine. In database 54 it stores archived operations and device driver information. Database 54 contains information on the set of device drivers necessary when converting a machine. Users and client applications 30 communicate with the PowerConvert Business Server 42 through PowerConvert Web Services Interface 56. In one embodiment PowerConvert Web Services Interface 56 utilizes Simple Object Access Protocol (SOAP) over Hypertext Transfer Protocol (HTTP) to provide a standard interface. PowerConvert Controller 58 is an instance of an OFX controller 44 (see FIG. 3), but its role is specialized. It is responsible for running discovery jobs, and jobs that guide the overall conversion process, which includes deploying other controllers to remote machines, when necessary. Primarily, it is responsible for handling any requests to PowerConvert 24 that cannot be fulfilled synchronously in the time of a typical HTTP request/response.

OFX 28 controls and reports on the jobs requested by PowerConvert 24 and client applications 30. OFX 28 resides on a server and has a distinct URL. In essence OFX 28 is a generic job management engine that remotely executes and monitors jobs through OFX controllers 44 (see FIG. 3). Applications can be created through the use of OFX functionality. The core of OFX 28 is OFX Business Server 60, which runs jobs through OFX Controllers 44. OFX Business Server 60 is passive; it is a web server and responds to communication from OFX Web Services Interface 46. In one embodiment OFX Web Services Interface 46 utilizes Simple Object Access Protocol (SOAP) over Hypertext Transfer Protocol (HTTP) to provide a standard interface. OFX Business Server 60 stores all information on requests, the status of requests and machine configuration information in a database 62. In operation OFX Business Server 60 receives information on the status of requests from OFX Web Services Interface 46 through OFX controllers 44 (see FIG. 3) installed on machines in data center 22.

PowerConvert 24 is a fully automated solution for OS portability. That is, PowerConvert 24 can move the entire contents of a machine, including its operating system, applications and data, to another machine. PowerConvert 24 will convert a source machine to a target machine. As discussed earlier, the types of source and target machines are Physical (P), Virtual (V), and Image (I). The steps required for each of the nine possible conversion types are illustrated in Table 1 below. Note that the first four rows refer to discovery steps. Discovery steps are prerequisites to the conversion taking place. If the desired source and target machines cannot be discovered, the conversion will not take place.

Depending on the source machine and the target machine types used in a conversion, the actual steps used in the conversion process differ. Typically, either a step can be omitted because it is not needed, or a different step needs to be inserted because of the special processing involved for that conversion type.

There are some prerequisites before the conversion can begin. First, the appropriate source and target machines must be discovered. Next, the user must initiate and configure the parameters that define the conversion process. By default, the target machine will be configured with essentially the same properties as the source machine. This includes the hostname, amount of RAM, network configuration, number and sizes of disks, and other information. Using GUI 26, the user then modifies the configuration of the target machine to suit their needs. This may include changing the hostname or changing the memory size of the target machine.

The conversion process is defined in a set of OFX jobs and actions that run on various OFX controllers 44 installed on machines throughout the data center.

The conversion process is guided by a job running on PowerConvert Controller 58. Each action (or step) in the job is run in sequence. PowerConvert Controller 58 cannot be expected to perform the entire conversion process, since the conversion is almost always distributed among several machines in the data center. Whenever the ‘next’ step in the conversion process needs to be run on a remote machine (for example, an ESX server on which a virtual machine will be created), it is the responsibility of the job running on PowerConvert Controller 58 to schedule the appropriate job to run on the appropriate OFX controller 44 (see FIG. 3). It does this by calling OFX 28.

Table 1 below indicates which steps need to be executed for the given conversion type.

TABLE 1
Discovery and PowerConvert Steps
(x = step is executed; - = step is not executed)

Description                                                   P2V  V2V  I2V  P2I  V2I  I2I  P2P  V2P  I2P
Discover Source                                                x    x    -    x    x    -    x    x    -
Discover Source Machine Container                              -    -    x    -    -    x    -    -    x
Discover Target Machine Container                              x    x    x    x    x    x    -    -    -
Discover Physical Target                                       -    -    -    -    -    -    x    x    x
Install Controller on Source Machine Container if necessary    -    -    x    -    -    x    -    -    x
Install Controller on Target Machine Container if necessary    x    x    x    x    x    x    -    -    -
Create VM                                                      x    x    x    -    -    -    -    -    -
Take Control of Target Machine                                 x    x    x    -    -    -    -    -    -
Create Volumes on Target                                       x    x    x    -    -    -    x    x    x
Take Control of Source Machine                                 x    x    -    x    x    -    x    x    -
Copy Volumes from Source to Target                             x    x    x    x    x    x    x    x    x
Prepare OS to boot                                             x    x    x    -    -    -    x    x    x
Restart Target                                                 x    x    x    -    -    -    x    x    x
Configure OS in Target                                         x    x    x    -    -    -    x    x    x
Restart Source Machine (optional)                              x    x    -    x    x    -    x    x    -
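As a rough sketch only, the selection of steps for a conversion type can be expressed as a set of rules in Python. The function name steps_for and the three predicates are hypothetical; the rules are an assumption inferred from the step descriptions in this document and from the number of marks in each row of Table 1, not a definitive statement of how PowerConvert selects steps.

```python
def steps_for(conversion):
    """Return the ordered step names for a conversion such as 'P2V'."""
    src, dst = conversion.split("2")
    src_live = src in ("P", "V")       # assumption: source can be discovered and controlled directly
    dst_bootable = dst in ("P", "V")   # assumption: target will boot an operating system
    dst_contained = dst in ("V", "I")  # target resides in a machine container
    rules = [
        ("Discover Source", src_live),
        ("Discover Source Machine Container", src == "I"),
        ("Discover Target Machine Container", dst_contained),
        ("Discover Physical Target", dst == "P"),
        ("Install Controller on Source Machine Container", src == "I"),
        ("Install Controller on Target Machine Container", dst_contained),
        ("Create VM", dst == "V"),
        ("Take Control of Target Machine", dst == "V"),
        ("Create Volumes on Target", dst_bootable),
        ("Take Control of Source Machine", src_live),
        ("Copy Volumes from Source to Target", True),
        ("Prepare OS to Boot", dst_bootable),
        ("Restart Target", dst_bootable),
        ("Configure OS in Target", dst_bootable),
        ("Restart Source Machine (optional)", src_live),
    ]
    return [name for name, applies in rules if applies]


for step in steps_for("I2V"):
    print(step)
```

Keeping the rules in one ordered list means the sequence of steps and the conditions under which each applies stay together, which mirrors the row order of Table 1.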

When PowerConvert 24 has been instructed to perform a conversion from a source machine to a target machine, it needs to provide instructions to and receive status information from those machines. This is done through OFX 28 via OFX Web Services Interface 46.

A user through the use of GUI 26, or an application through clients 30, opens a discover machine dialog and provides the machine identification, such as a hostname or IP address, and their credentials. This results in a job being scheduled on PowerConvert controller 58 to discover the information about the source machine. Once complete, the information collected is forwarded to OFX 28 and stored in database 62. The discovery gathers all the necessary information needed for a conversion, as well as some other information that may be useful to the user. The information includes all of the machine's components: processors, disks, network adapters, the amount of memory on the machine, details about the operating system, and the network connections.

If the source machine is running Windows, PowerConvert 24 makes use of WMI (Windows Management Instrumentation) to remotely query the source machine. Since not all of the information that PowerConvert 24 needs is available through WMI, some other means to gather information are utilized. For example, the Media Access Control (MAC) address of each Network Interface Card (NIC) and the properties of each disk volume are queried by deploying a small executable program to the source machine, running it, and copying back the data it generates. In the case of a Linux source machine, PowerConvert 24 communicates with the source machine using a secure protocol such as Secure Shell (SSH). PowerConvert 24 copies a small executable program to the source machine, runs it, and copies back the data it generates. In one embodiment the data is provided as Extensible Markup Language (XML).
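For illustration, the following Python sketch reads discovery data returned as XML. The element and attribute names in SAMPLE (machine, hostname, memoryMB, nic, volume) are invented for this example and are an assumption; the actual schema used by PowerConvert is not reproduced here.

```python
import xml.etree.ElementTree as ET

# Hypothetical discovery output; the real schema may differ entirely.
SAMPLE = """<machine>
  <hostname>web01</hostname>
  <memoryMB>2048</memoryMB>
  <nic mac="00:11:22:33:44:55"/>
  <volume name="C:" sizeGB="20"/>
</machine>"""


def parse_discovery(xml_text):
    """Extract a few machine properties from the discovery XML."""
    root = ET.fromstring(xml_text)
    return {
        "hostname": root.findtext("hostname"),
        "memory_mb": int(root.findtext("memoryMB")),
        "macs": [nic.get("mac") for nic in root.findall("nic")],
        "volumes": {v.get("name"): int(v.get("sizeGB")) for v in root.findall("volume")},
    }


info = parse_discovery(SAMPLE)
print(info["hostname"], info["macs"], info["volumes"])
```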

FIG. 5 is a block diagram illustrating machine object hierarchy, shown generally as 60. Each machine type is defined in an XML schema. To aid the reader in understanding this mapping of machines, FIG. 3 describes the physical presence of the machines. FIG. 5 describes the hierarchical structure of how machines are described using XML. Machine 62 is the base type from which all other machine types are derived. Each derived type defines additional properties that are not present in its base type.

By way of example we illustrate three types of virtual machines 70. They are Microsoft Virtual Machine 72, VMware ESX Virtual Machine 74 and VMware GSX Virtual Machine 76. An example of the XML describing a VMware ESX Virtual Machine 74 is provided in Appendix 1.
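The derivation of machine types from a common base can be pictured with a small Python class hierarchy. This is a sketch only: the class names follow the types named above, but every field (hostname, memory_mb, display_name, memory_shares) is a hypothetical property chosen for the example rather than one taken from the XML schema.

```python
from dataclasses import dataclass


@dataclass
class Machine:
    """Base type; all other machine types derive from it."""
    hostname: str
    memory_mb: int


@dataclass
class VirtualMachine(Machine):
    """Adds a property (assumed here) that the base type lacks."""
    display_name: str = ""


@dataclass
class VMwareESXVirtualMachine(VirtualMachine):
    """Adds a further assumed property specific to this derived type."""
    memory_shares: int = 0


vm = VMwareESXVirtualMachine("web01", 2048, display_name="web01-vm", memory_shares=1000)
print(isinstance(vm, Machine), vm)
```

Each derived class carries every property of its base and adds its own, which is the relationship described for the types in FIG. 5.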

In the case of conversion of a machine image 14 as a source machine, a discovery of its machine container occurs. Discovery of a machine container is a matter of determining whether the source machine has a certain property. That is, is it ESX, GSX, MSVS or an Image Server? After determining that the source machine is a machine container, queries are made to determine the properties of each machine container on the source machine. These properties include the version of the application (e.g. ESX v2.5), any special devices that are configured on the machine (e.g. the list of virtual NICs) as well as a list of all of the contained machines. If the source machine is a machine image 14 then the server on which it resides must be discovered; since a machine image 14 cannot be started, it cannot be discovered directly.

In the case of a conversion to a target virtual machine 12 or a machine image 14, a discovery of a machine container occurs. This discovery is the same as the previous discovery step mentioned for conversion of a source machine that is an image.

If the target machine is a physical machine 10, a discovery is made of the physical machine 10. In one embodiment this may require manual effort by the user, who must boot the machine using a PowerConvert boot CD, since it is expected that the physical machine may be bare. The boot CD contains a copy of Windows Preinstallation Environment (WinPE). The boot CD also contains a Windows application to assist the user to discover and register the machine with PowerConvert 24. The application prompts the user for the URL of PowerConvert 24 and the credentials with which to access it. This results in PowerConvert 24 instructing OFX 28 to create an OFX controller entry in database 62 for that machine. Next, an OFX controller 44 is downloaded from OFX 28 into the WinPE environment and installed and configured. A discovery job is then scheduled to run on this controller. The discovery job collects information about the physical machine such as: memory size, number of processors, speed of processors, number and sizes of disks including partitions and volumes and all available components, including network adapters and hard disk controllers. This machine information is stored by OFX 28 in database 62. In the case of Linux a Linux ramdisk is used instead of the boot CD. All other steps remain the same.

Referring now to FIGS. 6a and 6b, a flow chart of the functionality of PowerConvert 24 is shown generally as 80. Flow chart 80 refers to the steps conducted by PowerConvert 24 should the above mentioned discovery be successful. If discovery is not successful, information on the source and target machine will not be provided to the user via GUI 26 to allow them to initiate conversion.

Beginning at step 82, an OFX controller 44 is installed on the source machine container, if necessary. When the source machine for a conversion is a machine image 14, PowerConvert 24 manages machine image 14 by running jobs on its host machine image server 42. Thus, an OFX controller 44 must be deployed to the host machine image server 42. If the host machine image server 42 already has an OFX controller 44 installed on it from an earlier conversion, then this step can be skipped.

Moving now to step 84, if the target machine is a virtual machine 12 or a machine image 14, then PowerConvert 24 manages these machines by running jobs on the host Virtual Machine Server 40 (e.g., ESX, GSX or MSVS) or the Machine Image Server 42. Thus, an OFX controller 44 must be deployed to the host machine container. If the host machine container already has an OFX controller 44 installed on it from an earlier conversion, then this step can be skipped.

It is not uncommon for a host machine with an abundance of memory, disk, and CPU resources to be used for multiple purposes. For example, a host machine may be running both VMware GSX and MSVS. In this case, both machine container applications will share the same OFX controller 44 to run their jobs. There is no need to install a new instance of an OFX controller 44 for each machine container on the same host.

Similarly PowerConvert 24 might be installed on a host that is also running MSVS and an Image Server. In this case, PowerConvert controller 58, which is actually an instance of an OFX Controller, can be used to run the jobs necessary for the machine containers.
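The "install only if necessary" behavior of steps 82 and 84, with one controller shared per host, may be sketched as follows. The class ControllerRegistry and its method ensure_controller are hypothetical names; the sketch models installation as a dictionary entry and does not deploy anything.

```python
class ControllerRegistry:
    """Tracks which hosts already have a controller (illustrative only)."""

    def __init__(self):
        self._by_host = {}
        self.installs = 0  # counts how many installations were actually performed

    def ensure_controller(self, host):
        """Return the controller for host, installing one only if absent."""
        if host not in self._by_host:
            self.installs += 1
            self._by_host[host] = f"controller@{host}"
        return self._by_host[host]


registry = ControllerRegistry()
gsx = registry.ensure_controller("host-a")   # host-a runs GSX: controller installed
msvs = registry.ensure_controller("host-a")  # host-a also runs MSVS: controller reused
print(gsx == msvs, registry.installs)
```

Keying the registry by host rather than by machine container is what allows two container applications on the same host to share a single controller.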

Next is step 86. This step only needs to run in the case of converting to a virtual machine. When the target machine is a virtual machine, PowerConvert 24 runs a job on the host of the virtual machine server 40 to create and manage a virtual machine 12. Each type of virtual machine server 40 provides its own API that can be used to create and manage one of its virtual machines. The PowerConvert actions running in the jobs make calls to the virtual machine server 40 through the available APIs.

By default, the properties of a virtual machine 12 are set to reflect the properties of the source machine. While configuring the conversion in the PowerConvert GUI 26, the user has the option to adjust many of the properties of the new virtual machine 12 to make optimal use of the resources available. The following properties of a virtual machine 12 may be configured:

a) The display name (as used by the virtual machine server)

b) Memory (RAM)

c) Minimum memory size

d) Memory shares

e) Number and size of the hard disks

f) Hard disk controller types (IDE or SCSI)

g) Number of CPUs

h) CPU min and max, shares and affinity

i) Number of NICs and the mapping to a virtual adapter

Finally, the new virtual machine 12 is registered with the virtual server 40.
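The default-then-override configuration described for step 86 can be sketched in Python as follows. The property keys in CONFIGURABLE loosely paraphrase items a) through i) above, and both the keys and the function configure_target are hypothetical names chosen for this example.

```python
# Assumed property keys, loosely following items a) to i) above.
CONFIGURABLE = {
    "display_name", "memory_mb", "min_memory_mb", "memory_shares",
    "disks", "disk_controller", "cpu_count", "cpu_limits", "nics",
}


def configure_target(source_properties, overrides=None):
    """Start from the source machine's properties, then apply user overrides."""
    overrides = overrides or {}
    unknown = set(overrides) - CONFIGURABLE
    if unknown:
        raise ValueError(f"not configurable: {sorted(unknown)}")
    target = dict(source_properties)  # copy, so the source description is untouched
    target.update(overrides)
    return target


source = {"display_name": "web01", "memory_mb": 2048, "cpu_count": 2}
target = configure_target(source, {"memory_mb": 4096})
print(target)
```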

We now move to step 88. This step is run only in the case of conversion to a virtual machine. A virtual machine 12 has been created, but it cannot run because there is no operating system installed on the machine.

The OFX controller 44 on the virtual server 40 is responsible for running this job. In this job, the newly created virtual machine is modified so that it connects to a virtual CDROM, which contains a copy of the boot image (WinPE or Linux Ramdisk). Then the virtual machine is forced to reboot. When the machine restarts, it will boot from the CDROM. The boot image will load, and a controller 44 will be installed and configured.

In this step it is also possible to temporarily modify the memory footprint of the virtual machine for the purposes of running it under the control of PowerConvert 24. For example, it may be suitable for the user to configure the target virtual machine to run with 128 MB of RAM, but the overall conversion speed can be improved in some situations if the machine under control is given additional virtual memory to utilize.

There is no need to take control of a target physical machine during the conversion process, since this already happens during the discovery stage when the machine is booted from the CDROM.

Moving next to step 90, now that the VM has been created and is under the control of PowerConvert 24, disk partitions and volumes are created.

Moving next to step 94, PowerConvert 24 takes control of a source machine directly. The job runs in PowerConvert controller 58. The platform specific boot image is copied to the source machine. Next, the boot configuration file is backed up and modified to refer to the new image. Finally, the source machine is forced to reboot.

When the source machine reboots, it will boot from the new boot image. In one embodiment, the OFX controller 44 is contained in the boot image, so it does not need to be downloaded. Instead, it only needs to be configured. As soon as the machine boots into the boot image, the original boot loader configuration files are restored. This allows the machine to be restored back into its native operating system as soon as it is rebooted, even if it is rebooted in error.
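The back-up, modify and restore sequence applied to the boot configuration file in step 94 can be illustrated with a minimal Python sketch. The function names, the ".orig" backup suffix and the file contents are assumptions made for the example; the sketch does not reboot anything and does not reflect the actual format of any boot loader file.

```python
import os
import shutil
from pathlib import Path


def take_control(boot_config: Path, new_contents: str) -> Path:
    """Back up the boot configuration, then point it at the boot image."""
    backup = boot_config.with_name(boot_config.name + ".orig")
    shutil.copy2(boot_config, backup)     # keep the original, including metadata
    boot_config.write_text(new_contents)  # the next boot loads the boot image
    return backup


def restore_boot_config(boot_config: Path, backup: Path) -> None:
    """Put the original configuration back so the next reboot is native."""
    os.replace(backup, boot_config)
```

Restoring the original file as soon as the boot image is running, rather than at the end of the conversion, is what makes an accidental reboot return the machine to its native operating system.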

Moving now to step 96, the source and target machines are ready to begin copying. If either the source or target machines are physical or virtual machines, they are running under control. That is, they are running within a boot image, with a controller configured. If either the source or target machine is a machine image, then the controller of the machine container's host is used. In any case, there are two controllers ready to handle the copying of files. A ‘copy source’ job is scheduled to run on the source machine's controller, and a ‘copy target’ job is scheduled to run on the target machine's controller. In the jobs, one side binds to a network port and waits for a connection from the other side. Either the source or the target may be configured to listen on a port. In most cases, it does not matter since the conversion is taking place between two machines that are under control. Once a connection is made, the transfer can begin.
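The rendezvous between the two copy jobs can be sketched with ordinary TCP sockets. In this illustrative Python example the target side listens and the source side connects, both on the loopback address within one process; the function names are hypothetical, and the payload is raw bytes rather than the OFX Package format.

```python
import socket
import threading


def copy_target(listener, received):
    """Target side: wait for the source to connect, then read until it closes."""
    conn, _ = listener.accept()
    with conn:
        while chunk := conn.recv(4096):
            received.append(chunk)


def copy_source(port, payload):
    """Source side: connect to the listening target and stream the data."""
    with socket.create_connection(("127.0.0.1", port)) as sock:
        sock.sendall(payload)


def transfer(payload):
    """Run both sides locally and return what the target received."""
    received = []
    with socket.socket() as listener:
        listener.bind(("127.0.0.1", 0))  # port 0 lets the OS choose a free port
        listener.listen(1)
        port = listener.getsockname()[1]
        worker = threading.Thread(target=copy_target, args=(listener, received))
        worker.start()
        copy_source(port, payload)
        worker.join()
    return b"".join(received)


print(transfer(b"volume data"))
```

The roles could equally be reversed, with the source listening and the target connecting, since only the direction of the initial connection changes.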

PowerConvert 24 uses a file-based copy process. The source side begins with the root folder of a given volume and traverses the file system reading each file and folder. As each file and folder is found, the source side writes it to the socket connection. The data is streamed across the network in the OFX Package format. On the target side, the OFX Package is read from the network connection, one file at a time. As each new file arrives, it is recreated on the target machine with all of its associated properties. The intention is to recreate each file and folder exactly as it was on the source machine. The file transfer continues for each volume that is configured to be copied. The user has the option of choosing not to copy one or more volumes, if so desired. Further, to reduce the amount of data transferred, files that the operating system can recreate on the target are not copied from the source machine.
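A minimal sketch of the source-side traversal, assuming a simple name-based filter, is given below in Python. The function walk_volume and the skip list are hypothetical; pagefile.sys is used only as an example of a Windows swap file that need not be copied, and the sketch yields paths rather than writing OFX Package data.

```python
import os


def walk_volume(root, skip_names=("pagefile.sys",)):
    """Yield (relative_path, is_dir) for each folder and file under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # visit folders in a stable order
        for name in dirnames:
            yield os.path.relpath(os.path.join(dirpath, name), root), True
        for name in sorted(filenames):
            if name.lower() in skip_names:
                continue  # the target operating system can recreate this file
            yield os.path.relpath(os.path.join(dirpath, name), root), False
```

A copy job built on this generator could send each entry as it is produced, so that the whole volume never needs to be held in memory.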

As mentioned earlier, PowerConvert 24 uses a file-based copy process. That is, each individual file and folder is copied from the source to the target. The alternative to this is an image-based copy. In an image-based copy, the entire contents of a file system are read from the disk byte-by-byte, regardless of the file system.

There are several advantages to using a file-based copy instead of an image-based copy, as follows:

1. Resizing of volumes. At configuration time, the user may decide that the size of a volume on the source machine is not optimal for the target machine.

a) For example, the C: drive on a Windows source machine is 20 GB in size, and now near capacity. In this case, the corresponding volume on the target machine can be configured with an increased size of, say, 50 GB.

b) Similarly, a volume on the source may be underutilized. It may be sized at 120 GB, but only ever uses about 10 GB. In this case, the corresponding volume on the target machine can be configured with a smaller size of, say, 20 GB.

2. Automatic defragmentation of the file system on the target machine. Any file on the source machine may be fragmented. That is, its data is not stored contiguously on the disk. During PowerConvert's file transfer step, files are written to the target's disk one file at a time; each file will naturally occupy the next available sectors of the disk, since the disk starts off with a clean file system.

3. Filtering specific files so that they are not copied or are changed during the copy process. Files that can be recreated without copying, such as the swap file for the Windows operating system, need not be copied, which often saves 1 GB or more of data.

Moving now to step 98, PowerConvert 24 prepares the operating system to boot while it is still under control. It is only at this time that PowerConvert 24 has full access to the operating system that has just completed copying from the source. The following steps are taken:

a) Update drivers. Device drivers are installed on the Operating System. The drivers installed are those that match the plug and play identification of the devices on the target machine, which are determined at machine discovery time. For devices such as mass storage devices, it is vital to update the drivers while the machine is under control; otherwise the machine may never be able to boot.

b) Update Hardware Abstraction Layer (HAL) and kernel files. HAL and kernel files are updated, if necessary.

c) Update boot configuration file (boot.ini or grub.conf or linux.conf) so that the new machine will boot from the appropriate partition.

d) Update hostname, as configured by the user.

e) Update network connections. This is done at this time for Linux only; it needs to be done later for Windows.

f) Disable VMware tools, if necessary.

g) Disable MSVS additions, if necessary.

h) Update Windows services or Linux daemons, as configured by the user.

Moving now to step 100, the target virtual machine is restarted. This step runs on the target machine container. Its purpose is to ‘undo’ the take control. This involves undoing the temporary changes that were needed during the take control step, including the disconnection of the virtual CDROM and resetting the memory size back to the user configured amount. In the case of a physical machine, this step is run from PowerConvert controller 58. It schedules a job to run on the target physical machine's OFX controller 44, and instructs it to reboot.

Step 102 only needs to run for Windows target machines. For Linux, the target machine is fully configured by the end of the Prepare OS to Boot step 98. This step runs within a small Windows service that is injected into the target earlier and does the following:

a) Restore mount points on volumes

b) Configure network connections

c) Generate new Session Id

d) Join a domain or workgroup, as configured by user

e) Restore NT4 file security

Step 104 ends the process and is optional. This step brings the source machine out of the PowerConvert boot image and back into the native operating system. The user may want to ‘move’ a machine, instead of ‘cloning’ a machine. In this case, they may not want to restart the source machine, and the machine is left ‘under control’. If the user does want their source machine restarted, then PowerConvert 24 will relinquish control of that machine by running a ‘reboot’ job on the controller while the machine is under control.

To aid the reader in understanding the function of OFX 28 we will now describe some OFX terms.

A device is a generic term for a physical or virtual device that can be controlled. Examples of devices would be a computer, a virtual machine, a software application, a network switch or a group of devices. Devices can be nested to form a hierarchy. Information on a device includes a Globally Unique ID (guid), a display name, a security descriptor and Extensible Markup Language (XML) instance data. PowerConvert 24 extends the use of this instance data to store its own model of a machine object in XML format as discussed above with reference to FIG. 5.

An OFX job defines a set of actions. Jobs are executed by an OFX controller 44 and are versioned. Jobs may be scheduled against devices or controllers.

Actions provide implementation behavior for jobs. Actions allow developers and users to extend the use of OFX 28 with custom behavior for custom solutions and applications. Actions are implemented as dynamic link libraries and are reusable among jobs.

An OFX package is a binary format that is used for file distribution. Itis similar in notion to a tar file or a zip file.

OFX packages may be used in several ways namely:

1. OFX Action packages. When OFX controller 44 needs to execute an action, it must load a specified .dll. This .dll and any dependent .dll's are archived in a package available for OFX controller 44 to download from OFX 28. In this scenario OFX controller 44 requires the .dll's with their names and content, but little else.

2. OFX Job packages. These typically contain data files that are needed during the execution of a job. Multiple job packages can be used in a job.

3. PowerConvert File Transfer. During a conversion using PowerConvert 24, all of the files from a source machine are copied across a network to a target machine. For each file transferred all of the file's properties are transferred with it so that the file can be recreated on the target machine exactly as it was on the source machine.

4. PowerConvert Image Server. When archiving machine images on an image server, for each file in the archive, all of its properties and contents are stored so that it can be recreated at some later time when the machine image is deployed to a target machine.

The same package format is used for both Windows and Linux and is portable to any operating system format.

The structure of a package comprises four main components: a Package Preamble, Package Headers, File Headers and Files. We will now describe each in turn.

A Package Preamble identifies the package version.

Package Headers consist of a set of zero or more Headers that pertain to the package as a whole. The format of Package Headers is as follows:

Number of Headers: Four bytes

Header 1: Variable size

. . .

Header n: Variable size

Package Headers can be used by the package author to provide additional information or hints about the source or the contents. For example, a package header could provide a hint about the approximate number of files or the estimated size of the package. This gives the program reading the package enough information to estimate progress from 1% to 100%.

File Headers have the same format as Package Headers and consist of a set of zero or more Headers. The only difference from Package Headers is that the properties relate to a file, instead of the entire package. File Headers provide a great deal of flexibility as a new File Header can be added whenever the need arises to describe some additional property of a special file type. For example, File Header names used in one embodiment may include:

Attributes: An integer representation of the file attributes

CreationTime: An integer representation of the file creation time

LastWriteTime: An integer representation of the last modification time

Backup: Utilizes Windows backup semantics (Boolean)

A Header can be used for either packages or files. A Header includes a name/value pair that describes a property of either the package or the file. Each header has a MustUnderstand Boolean flag. When writing a package the author sets this value to TRUE if the reader must understand the meaning of this header. When reading a package the reader must check this flag. If the flag is TRUE and the reader does not understand the meaning of the header, it will fail. In describing headers and other components we will hereinafter refer to a “Length-prefixed UTF8 string”; this is a short form for “Four byte length-prefixed eight bit Unicode transformation format encoded string”. One format of a header is as follows:

Name: Length-prefixed UTF8 string

Value: Length-prefixed UTF8 string

MustUnderstand: One byte (Boolean)
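The header layout above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the byte order of the four-byte length is not given in this description (little-endian is assumed here), and the names Header, write_headers and read_headers are hypothetical rather than part of OFX 28.

```python
import io
import struct
from dataclasses import dataclass

# Assumption: byte order is not specified in the description; little-endian is used.
U32 = struct.Struct("<I")


@dataclass
class Header:
    name: str
    value: str
    must_understand: bool = False


def write_string(out, text):
    """Write a four-byte length prefix followed by the UTF-8 bytes."""
    raw = text.encode("utf-8")
    out.write(U32.pack(len(raw)))
    out.write(raw)


def read_string(src):
    """Read a four-byte length prefix, then that many UTF-8 bytes."""
    (length,) = U32.unpack(src.read(4))
    return src.read(length).decode("utf-8")


def write_headers(out, headers):
    """Write the header count followed by each Name/Value/MustUnderstand triple."""
    out.write(U32.pack(len(headers)))
    for header in headers:
        write_string(out, header.name)
        write_string(out, header.value)
        out.write(b"\x01" if header.must_understand else b"\x00")


def read_headers(src, known_names):
    """Read a header set, failing on an unknown header marked MustUnderstand."""
    (count,) = U32.unpack(src.read(4))
    headers = []
    for _ in range(count):
        name = read_string(src)
        value = read_string(src)
        must_understand = src.read(1) != b"\x00"
        if must_understand and name not in known_names:
            raise ValueError(f"unsupported mandatory header: {name}")
        headers.append(Header(name, value, must_understand))
    return headers
```

A reader that does not recognise an optional header simply carries it along, while an unrecognised header flagged MustUnderstand stops the read, which is the behavior the flag is described as providing.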

The Files section consists of a sequence of File entries. The number of files in the package is never specified. Instead there exists a Boolean marker with the value TRUE before the beginning of each new File to indicate that a file follows. A Boolean marker with the value FALSE will follow the last file in the package. With this format the writer of a package does not need to pre-calculate the number of files in the package. This is beneficial during a file transfer between two machines. As each file is read from the disk, it can be immediately sent across the network to the target machine. Thus there is no need to wait until the package is fully assembled. The Files section may be embodied in a format as follows:

MoreFlag: One byte (Boolean TRUE)

File 1: Variable size

MoreFlag: One byte (Boolean TRUE)

File 2: Variable size

. . .

MoreFlag: One byte (Boolean TRUE)

File n: Variable size

End of Files Marker: One byte (Boolean FALSE)

Each File describes the properties and contents of a file or directory. A File structure is as follows:

Full Name of File: Length-prefixed UTF8 string

Is Directory: One byte (Boolean)

File Length in Bytes: Eight bytes

File Headers: Variable size

File Contents: Variable size
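As an illustration of why the MoreFlag marker allows streaming, the following Python sketch writes and reads a Files section from an iterator, so no file count is needed up front. It is a sketch under assumptions, not the OFX implementation: little-endian integers are assumed, headers are represented as plain (name, value, must_understand) tuples, and the function names are hypothetical.

```python
import io
import struct

# Assumption: integer byte order is not specified in the description.
U32 = struct.Struct("<I")
U64 = struct.Struct("<Q")


def _write_str(out, text):
    raw = text.encode("utf-8")
    out.write(U32.pack(len(raw)) + raw)


def _read_str(src):
    (length,) = U32.unpack(src.read(4))
    return src.read(length).decode("utf-8")


def write_files(out, entries):
    """entries yields (name, is_dir, headers, content); each is written as it arrives."""
    for name, is_dir, headers, content in entries:
        out.write(b"\x01")                        # MoreFlag TRUE: a File follows
        _write_str(out, name)                     # Full Name of File
        out.write(b"\x01" if is_dir else b"\x00")  # Is Directory
        out.write(U64.pack(len(content)))         # File Length in Bytes
        out.write(U32.pack(len(headers)))         # File Headers: count, then headers
        for h_name, h_value, must_understand in headers:
            _write_str(out, h_name)
            _write_str(out, h_value)
            out.write(b"\x01" if must_understand else b"\x00")
        out.write(content)                        # File Contents
    out.write(b"\x00")                            # End of Files marker (FALSE)


def read_files(src):
    """Yield (name, is_dir, headers, content) until the FALSE marker is read."""
    while src.read(1) == b"\x01":
        name = _read_str(src)
        is_dir = src.read(1) == b"\x01"
        (size,) = U64.unpack(src.read(8))
        (count,) = U32.unpack(src.read(4))
        headers = []
        for _ in range(count):
            h_name = _read_str(src)
            h_value = _read_str(src)
            headers.append((h_name, h_value, src.read(1) == b"\x01"))
        yield name, is_dir, headers, src.read(size)
```

Because write_files consumes an iterator, the sender can pass a generator that walks the source disk, and each entry reaches the output stream before the next file is even opened.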

The OFX package format is designed to be flexible for the easy addition of extensions. For example, file compression can be added without changing the File format by adding a File Header with the type “compressed” for each file that requires compression. On the target machine the package reader will know that the file is in a compressed format when it sees the “compressed” File Header.

We will now provide more detail on PowerConvert Web Services Interface 56 and OFX Web Services Interface 46.

PowerConvert Web Services Interface 56 deals primarily with Operations and Machines. To deal with these it provides an Operation Web Service and a Machine Web Service. The Operation Web Service provides a wrapper around the OFX job information, especially with respect to tracking the relationships between all of the remote jobs that make up a single conversion process. GUI 26 calls the Operation Web Service when it wants to check on the status of a conversion, using OperationWebService.GetOperation( ). The Operation Web Service also includes methods such as AbortOperation and DeleteOperation.

The Machine Web Service handles all requests related to machines, including:

a) GetMachine( ), to retrieve the machine properties

b) Discover( ), which schedules a job on PowerConvert controller 58 to gather information about the specified machine, and add the results to database 62.

c) Undiscover( ), which will remove the machine from database 62.

d) ConvertToMachineContainer( ), which schedules a job on PowerConvert controller 58 to convert a source machine to a contained machine (i.e. x2V or x2I).

OFX Web Services Interface 46 provides access to OFX Business Server 60. Web services are organized into three groups: Installation/Setup, Controller-Only and Runtime.

Installation/Setup web services are used to configure OFX 28 with jobs, actions and packages. These are primarily used at installation time. Examples of Installation/Setup web services are:

a) JobWebService allows the user to add, delete, modify and query job definitions, including actions such as: AddJob, GetJob, GetJobs, DeleteJob and SetJob.

b) ActionTypeWebService allows the user to add, delete, modify and query ActionType definitions, including actions such as: AddActionType, DeleteActionType, GetActionType and SetActionType.

c) PackageWebService allows the user to add, delete, modify and upload package definitions, including actions such as: AddPackage, GetPackage, SetPackage and UploadPackage.

d) SchemaWebService provides a means for defining the structure, content and semantics of XML documents in more detail. This web service is used to add and get schemas and includes actions such as AddSchema, GetSchema and GetSchemas.

e) ImportExportWebService is used to import jobs, action types, devices, schemas and packages into OFX 28. The input is an XML document that contains the definitions of one or more jobs, action types and packages. This web service will call each of JobWebService, ActionTypeWebService, DeviceWebService, SchemaWebService and PackageWebService as required.

f) ConfigurationWebService is used to get and set security descriptor fields in database 62. These include jobs, action types, devices, controllers and packages. Actions include GetRootSecurityDescriptor and SetRootSecurityDescriptor.

Controller-Only web services are used by OFX controllers 44. There are two types of Controller-Only web services, namely ControllerNotificationWebService and ControllerPackageDownload.

ControllerNotificationWebService is used by OFX controllers 44 to notify OFX 28 of the controller status and the status of the jobs they are running. The three operations are:

a) Startup, which provides notification to OFX 28 when OFX controller 44 starts.

b) Heartbeat, which is used by an OFX controller 44 to inform OFX 28 that it (the controller) is still running. An OFX controller 44 sends to OFX 28 a snapshot of any jobs that it is currently running. OFX 28 then sends the OFX controller 44 any new jobs that have been scheduled to run on the OFX controller 44. Typically this occurs every five seconds or so.

c) Shutdown, which provides notification to OFX 28 when an OFX controller 44 stops.

ControllerPackageDownload is used by OFX controllers 44 to download any packages that are required to execute a job. A job is never executed until an OFX controller 44 has successfully downloaded all of the dependent packages. Packages are defined by a guid and a version. When an OFX controller 44 downloads a package, the package is cached on disk, in case it is needed later to run another job that has the same package dependency.

Runtime Web Services are of three types: JobSchedulingWebService, ControllerWebService and DeviceWebService.

JobSchedulingWebService includes the following services:

a) ScheduleJob, which is used for defining a new job instance, along with input parameters and the device or controller for the job to be executed on.

b) GetScheduledJob, which returns information about the specified job. This includes the job status, the status of each action in the job and any input and output data.

c) GetControllerJobs, which returns all of the jobs related to a specific controller 44.

d) AbortJob, which aborts a running job.

e) DeleteJob, which is used to delete a scheduled job when it is no longer needed on the system.

ControllerWebService includes the following services:

a) AddController, which is used to define and configure a new OFX controller 44 just before the controller is deployed to a machine.

b) SetController, which is used for modifying the properties of controller 44.

c) DeleteController, which is used to delete a specified OFX controller 44 entry in the database.

d) GetController, which returns the properties and attributes of a specified controller.

e) GetControllers, which returns properties and attributes of all controllers 44.

f) GetControllerEventLogEntries, which returns the event log entries for OFX controller 44.

g) ClearControllerEventLogEntries, which clears the event log entries for OFX controller 44.

DeviceWebService allows the user to add, delete, modify and query device definitions. It includes services such as: AddDevice, SetDevice, GetDevice, GetDevices, GetDeviceIds and DeleteDevice.

An example of a SOAP request and response for the service ControllerWebService.GetController follows. The placeholder fields, such as length, string, boolean and dateTime, need to be replaced with actual values.

EXAMPLE ControllerWebService.GetController

POST /ofxweb/Controller.asmx HTTP/1.1
Host: shark.platespin.com
Content-Type: text/xml; charset=utf-8
Content-Length: length
SOAPAction: "http://schemas.platespin.com/ofx/ws/GetController"

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <id xmlns="http://schemas.platespin.com/ofx/ws/">string</id>
    <options xmlns="http://schemas.platespin.com/ofx/ws/">Properties or Data or SecurityDescriptor</options>
  </soap:Body>
</soap:Envelope>

HTTP/1.1 200 OK
Content-Type: text/xml; charset=utf-8
Content-Length: length

<?xml version="1.0" encoding="utf-8"?>
<soap:Envelope xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/">
  <soap:Body>
    <controller id="string" status="Unknown or Running or Stopped" dateCreated="dateTime" dateModified="dateTime" bootFile="string" bootFromNetwork="boolean" bootPlatform="Windows or Linux" bootExpected="boolean" xmlns="http://schemas.platespin.com/ofx/ws/">
      <description>string</description>
      <data>xml</data>
      <macAddresses>
        <macAddress>string</macAddress>
        <macAddress>string</macAddress>
      </macAddresses>
      <securityDescriptor>base64Binary</securityDescriptor>
    </controller>
  </soap:Body>
</soap:Envelope>

Referring now to FIG. 7, a block diagram of the components of an OFX controller 44 is shown. An OFX controller 44 runs on a computer host to control one or more OFX devices, and it executes jobs provided to it by OFX 28. The design of OFX controllers 44 is independent of platform architecture and uses OFX Web Services Interface 46 to communicate with OFX 28. OFX controllers 44 are generic in that they know nothing about the actions they are executing. All the code needed to execute an action is downloaded on demand from OFX 28.

An OFX controller 44 for Windows is deployed from OFX 28 by using NetBIOS and WMI to copy OFX controller 44 to a remote machine and register it as a Windows Service. Similarly an OFX controller 44 for a Linux machine is deployed using the SSH protocol. In both cases, administrator credentials are required on the targeted machine.

At deployment time each OFX controller 44 is configured with:

a) The URL of OFX 28, which the OFX controller 44 uses to contact OFX 28.

b) A unique guid so that it can be identified by OFX 28.

c) A randomly generated symmetric encryption key for security. This “secret”, along with a nonce value, is used as a signature for the purpose of preventing replay attacks.
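One way a shared secret and a nonce can be combined to resist replay is sketched below in Python. The description does not state which signing scheme is used; HMAC-SHA256 is an assumption made purely for illustration, and the names sign and ReplayGuard are hypothetical.

```python
import hashlib
import hmac
import secrets


def sign(secret: bytes, nonce: bytes, body: bytes) -> str:
    """Sign the nonce and message body with the shared secret (HMAC-SHA256 assumed)."""
    return hmac.new(secret, nonce + body, hashlib.sha256).hexdigest()


class ReplayGuard:
    """Server-side check: a valid signature is accepted only once per nonce."""

    def __init__(self, secret: bytes):
        self.secret = secret
        self.seen_nonces = set()

    def verify(self, nonce: bytes, body: bytes, signature: str) -> bool:
        if nonce in self.seen_nonces:
            return False  # same nonce presented again: treat as a replay
        expected = sign(self.secret, nonce, body)
        if not hmac.compare_digest(expected, signature):
            return False  # signature was not produced with the shared secret
        self.seen_nonces.add(nonce)
        return True


# Usage: the controller signs each message with a fresh nonce.
secret = secrets.token_bytes(32)
guard = ReplayGuard(secret)
nonce = secrets.token_bytes(16)
signature = sign(secret, nonce, b"heartbeat")
first_attempt = guard.verify(nonce, b"heartbeat", signature)
replayed_attempt = guard.verify(nonce, b"heartbeat", signature)
```

In this sketch the first attempt is accepted and the identical second attempt is rejected, because the nonce has already been seen.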

As shown in FIG. 7, an OFX controller 44 comprises five main components: notification service 110, job manager 112, scheduler service 114, package manager 116 and job execution process(es) 118.

Notification service 110 regularly checks in with OFX Business Server 60 through OFX Web Services Interface 46 to report status and to determine if any new jobs are waiting. This checking in is referred to as a “heartbeat” and occurs frequently, typically on the order of every five seconds, but is user definable. On each heartbeat OFX controller 44 will send a snapshot of the status of each running job and the latest log file entries. Log file entries are maintained by the controller to indicate the status of a job. Examples of log file entries are: job received and package downloaded. A log file provides a running status of the job progression. Notification service 110 will receive any new jobs that have been scheduled to run on an OFX controller 44 since the last heartbeat. An OFX controller 44 will keep sending a snapshot of a given job at the heartbeat until the job has completed running. The benefit of such a design is that the dataflow across a network is minimized.

Job manager 112 is responsible for persisting all of the job XML files on disk. When notification service 110 receives a new job from OFX 28, it forwards that job to job manager 112 which will immediately persist that job XML to disk. Next, job manager 112 will notify scheduler service 114 that a new job has arrived. Also, when notification service 110 is preparing a heartbeat for OFX 28 it will ask job manager 112 for all the jobs that have been modified since the last successful heartbeat. As a job executes the job data will change. For example the status of an action will change from “NotRun” to “Running” when the action starts. Job manager 112 stores this information in the form of an XML file in job XML folder 120.
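The interplay between the heartbeat and the job manager's record of modified jobs can be sketched as follows. This Python fragment is illustrative only: jobs are held in a dictionary rather than as XML files on disk, and the send callable is a hypothetical stand-in for the call made through the web services interface.

```python
class JobManager:
    """Keeps job data and remembers which jobs changed since the last good heartbeat."""

    def __init__(self):
        self.jobs = {}       # job id -> job data (stands in for the job XML files)
        self.dirty = set()   # ids modified since the last successful heartbeat

    def save(self, job_id, data):
        self.jobs[job_id] = data
        self.dirty.add(job_id)

    def modified_jobs(self):
        return {job_id: self.jobs[job_id] for job_id in sorted(self.dirty)}

    def heartbeat_succeeded(self, sent_ids):
        self.dirty -= set(sent_ids)


def heartbeat(manager, send):
    """Send a snapshot of modified jobs and store any new jobs returned by the server."""
    snapshot = manager.modified_jobs()
    new_jobs = send(snapshot)  # hypothetical transport; raises if the heartbeat fails
    manager.heartbeat_succeeded(snapshot.keys())
    for job_id, data in new_jobs.items():
        manager.save(job_id, data)
    return snapshot
```

If send raises, the modified set is left untouched, so the same jobs are reported again on the next heartbeat; only changed jobs travel across the network.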

Scheduler service 114 is responsible for the running of each job. Scheduler service 114 schedules a job for a given time and controls a queue of jobs to be run. Before a job can be executed any dependent packages must first be downloaded, so package manager 116 is sent a request to download the packages. Each job has one or more associated packages, which contain everything needed to execute a job. The packages required to run a job are specified in the job XML. Scheduler service 114 executes a job by spawning a separate job execution process 118. Scheduler service 114 then waits for the process 118 to exit, at which time it will check the exit status of the process. If the job execution process 118 failed for any reason, scheduler service 114 is responsible for setting the job status to “Failed” by modifying the job XML in job XML folder 120. Finally, scheduler service 114 informs job manager 112 that the job has completed.

Each job is run in its own process 118 to protect any other running jobs. There is one new process created for each job that needs to be run. There is no limit to the number of jobs that can run concurrently on a single OFX controller 44, except for the usual memory, disk and CPU resource constraints of a machine.
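The process-per-job pattern can be illustrated with Python's standard subprocess module. This is a minimal sketch, assuming the job record is a dictionary rather than an XML file and that the command line for the job execution process is supplied by the caller; run_job is a hypothetical name.

```python
import subprocess
import sys


def run_job(job, command):
    """Run one job in its own process and mark it Failed on a non-zero exit status."""
    process = subprocess.Popen(command)  # a new process is created for every job
    exit_status = process.wait()         # block until the job execution process exits
    if exit_status != 0:
        # In the described design the job execution process records its own status;
        # the scheduler only steps in when that process fails.
        job["status"] = "Failed"
    return exit_status


# Usage: a child that exits with status 3 stands in for a failed job.
job = {"id": "example", "status": "Running"}
status = run_job(job, [sys.executable, "-c", "import sys; sys.exit(3)"])
```

Because the job runs in a separate process, a crash in one job cannot take down the controller or other running jobs.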

The job execution process 118 is responsible for running the job. It does this by loading and executing each of the actions specified in the job. The job execution process is also responsible for setting the status of each action as it changes, as well as the overall job status. It is also responsible for flushing the job XML to the job XML folder 120 at the completion of each action.

Package manager 116 communicates with an OFX file server 122 to download all packages specified in a job. OFX file server 122 is under the control of OFX 28. Package manager 116 will also store its own cache of packages that have been downloaded previously in packages folder 124. Packages are identified by a name (guid) and a version. Package manager 116 will access cached packages in packages folder 124. As the number of packages to download for a specific job can vary, and their sizes can vary, any request to package manager 116 is asynchronous. Package manager 116 will notify scheduler service 114 when it completes downloading all of the packages for a specific job. Package manager 116 automatically handles retries, in the case that the download of a package fails because of some temporary network difficulties.
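A cache keyed on guid and version, with retries on download failure, might look like the following Python sketch. It is synchronous for brevity, whereas requests to package manager 116 are described as asynchronous; the download callable, the retry count and the in-memory cache are all assumptions made for illustration.

```python
import time


class PackageCache:
    """Caches packages by (guid, version) and retries failed downloads."""

    def __init__(self, download, retries=3, delay_seconds=0.0):
        self.download = download          # hypothetical callable: (guid, version) -> bytes
        self.retries = retries
        self.delay_seconds = delay_seconds
        self.cache = {}                   # stands in for the packages folder on disk

    def get(self, guid, version):
        key = (guid, version)
        if key in self.cache:
            return self.cache[key]        # already downloaded for an earlier job
        last_error = None
        for _ in range(self.retries):
            try:
                self.cache[key] = self.download(guid, version)
                return self.cache[key]
            except OSError as error:      # treated here as a temporary network problem
                last_error = error
                time.sleep(self.delay_seconds)
        raise RuntimeError(f"could not download package {guid} {version}") from last_error
```

A second job that depends on the same guid and version is served from the cache without contacting the file server again.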

To better illustrate the types of information utilized by OFX 28 we will now briefly describe the contents of database 62. In one embodiment database 62 is a relational SQL database containing a plurality of tables. The main tables provide information on: packages, scheduled jobs, controllers, devices and actions. As an example of the structure of database 62, the tables for scheduled jobs and controllers would be linked together by controller id. The scheduled jobs table would include information about a job, such as a job id, a job version, a device id, status, date scheduled and other fields. The controllers table would include information about a controller such as: id, security descriptor, description, pointers to a bootfile, status, and other fields. It is not the intent of the inventors to restrict the use of the present invention to a specific implementation of database 62 but rather to indicate that it serves as a repository for OFX 28.

FIG. 8 is a block diagram of the components of PowerRecon 32. PowerRecon 32 is designed to aid in the consolidation of operating systems, applications and data on servers in data center 22. PowerRecon 32 monitors the servers to collect information and provides detailed plans on how a consolidation may be accomplished. Information may be collected through the use of Windows services such as WMI or Windows Performance Counters. In the case of Linux, commands to collect information may be issued through a service such as SSH. A user, through the use of GUI 26, may examine the information collected by PowerRecon 32 and select which consolidations should occur.

PowerRecon 32 comprises two main modules, Web Services Application Programming Interface (API) 130 and Software Developer's Kit (SDK) 132, which communicate with each other. Web Services API 130 comprises three modules: inventory 134, performance 136 and analysis 138.

In one embodiment there are five groups of methods provided by Web Services API 130. They are:

1) Autonomic. These provide analysis and optimization web services.

2) Inventory. These provide inventory gathering, machine and machine container information, group information, and security credentials. Machines contain containers; groups can contain other groups and machines.

3) Nodes. These provide hierarchical definitions of the groups, machines and containers.

4) Performance Data Collection. These provide methods for the starting and stopping of data collection and retrieval of performance data.

5) Reports. These report on actions that the system is performing. They return information on the state and progress of a task such as running an analysis, running inventory or running optimization.

Examples of methods in Web Services API 130 used by inventory 134 include:

a) Import. This imports information about a machine by returning the id of the machine as stored in the PowerOptimize database 146.

b) Inventory. This discovers a set of servers of the same platform type such as all ESX machines.

Examples of methods in Web Services API 130 used by performance 136 include:

a) Start. This starts performance monitoring on a machine.

b) Stop. This stops performance monitoring on a machine.

c) GetMetricData. This gets the metric data collected for a machine.

d) GetMonitoredMachines. This provides a list of machines being monitored for performance metrics.

Examples of methods in Web Services API 130 used by analysis 138 include:

a) AnalyseMachine. This starts an analysis on a single machine, comparing it against a set of thresholds to determine if it is overused, underused or within a target range.

b) StartAnalysis. This performs the same tests as AnalyseMachine but does so on a group of machines.

Inventory 134, through the use of OFX controllers described earlier in reference to OFX 28, collects information on a server. Information to be collected includes detailed information about machines and containers. The information collected is stored in PowerOptimize database 146. This information may be refreshed at the request of the user or automatically on a regular schedule.

Performance 136 examines the data collected in PowerRecon database 148. This data includes information on the performance data that needs to be collected, such as: disk I/O, CPU usage, CPU pages, NIC Megabits per Second (Mbps) and memory usage. PowerRecon database 148 also stores historical performance data that has been collected. As a part of the Web Services API 130, it returns the data requested from database 148.

Analysis 138 examines the information collected by inventory 134 and performance 136 stored in databases 146 and 148 to make suggestions to optimize the performance of the servers in data center 22. FIG. 9 illustrates the logical steps of analysis 138.

PowerRecon SDK 132 comprises two main modules, database gatherer 140 and realtime gatherer 142. When a request is received to obtain information on a server in data center 22 from Web Services API 130, PowerRecon SDK 132 utilizes a Simple Performance Inspector (SPI) interface 144 to obtain information on the server. An SPI 144 may be created to act as an adapter for any source of performance data. An SPI 144 runs on a server and is configured to the server platform to utilize platform specific tools. Information may be collected using a variety of methods such as SSH for Linux, Windows Performance Counters, or tools provided by products such as VMware to best collect the data. In addition, multiple SPI's mitigate losses of data.

Database gatherer 140 extracts information from a server and returns it to Web Services API 130 to be stored in database 148. Realtime gatherer 142 collects information on a server in realtime and returns it to Web Services API 130 to be displayed to a user via GUI 26.

FIG. 9 is a flow chart of the functionality of the analysis portion (138) of PowerRecon 32 and is shown generally as 150. Beginning at step 152 databases 146 and 148 are queried for inventory and performance information on each server submitted for analysis. At step 154 a possibility test is made. For example, are there enough resources available on a target machine to move other machines to it? If three machines each require 1 GB of RAM and the target machine has only 2 GB of RAM then such a transfer is not possible. Moving to step 156 an intelligence test is performed. This step verifies that the usage of each resource (e.g. Disk I/O and CPU) for the machines to be transferred does not conflict. For example if two machines have intensive disk I/O usage, it would be useful to have them on servers with low I/O usage. In another example if two machines have high CPU usage from 3:00 PM to 6:00 PM Monday to Friday it would be efficient to have them on separate servers. Conversely if one machine had a high CPU usage from 8:00 to 12:00 from Monday to Friday and another from 12:00 to 4:00, they would be good candidates to reside on the same server. Any number of analyses may be used to determine a good fit for machines on servers based upon resource use such as: mean, trend analysis, standard deviation, or min/max values. At this step a suggestion may also be made of the size of a new virtual machine. For example if a machine was configured for 2 GB of RAM, but only used 256 MB most of the time, a suggestion may be made to reduce the RAM of the target machine to 512 MB and perhaps increase the paging file size. In another scenario, if the server is a heavily used single processor machine, a suggestion could be made to move to a two CPU machine.
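The two tests in the examples above reduce to simple checks, sketched here in Python. The function names and the representation of busy periods as sets of hours are assumptions for illustration; an actual analysis would draw on the statistical measures listed above rather than fixed hour sets.

```python
def transfer_is_possible(target_ram_mb, source_ram_mb):
    """Possibility test: does the target have enough RAM for all of the sources?"""
    return sum(source_ram_mb) <= target_ram_mb


def busy_periods_conflict(busy_hours_a, busy_hours_b):
    """Intelligence test: do two machines have high CPU usage in the same hours?"""
    return bool(set(busy_hours_a) & set(busy_hours_b))


# Three machines needing 1 GB each do not fit on a target with 2 GB.
three_on_two_gb = transfer_is_possible(2048, [1024, 1024, 1024])

# 8:00 to 12:00 and 12:00 to 4:00 PM do not overlap, so the machines can share a server.
morning_vs_afternoon = busy_periods_conflict(range(8, 12), range(12, 16))

# Two machines both busy from 3:00 PM to 6:00 PM should be kept apart.
both_late_afternoon = busy_periods_conflict(range(15, 18), range(15, 18))
```

Here three_on_two_gb is False, morning_vs_afternoon is False (no conflict) and both_late_afternoon is True (conflict), matching the outcomes described in the text.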

Moving to step 158 the solutions are generated and stored in database 148 so that the user may view and change them via GUI 26. Any changes made by the user are verified by steps 152 and 156. If the user selects a solution, it is then passed to PowerConvert 24 for execution.

Moving to step 160 the user may request after a certain amount of time to return to step 152 to generate another analysis for the new configuration to determine if it is functioning as expected.

Referring now to FIG. 10, a block diagram of the components of PowerOptimize is shown. PowerOptimize 34 is directed toward load balancing, right sizing and self healing of data centers. Through the use of PowerRecon 32 it can utilize both historical and real time data to provide suggested changes. PowerOptimize 34 comprises five main logic engines, namely: expert systems 170, bin packing 172, neural networks 174, fuzzy logic 176 and user options 178. Suggestions 180 coordinates the information collected by each engine and interacts with analysis 138 and GUI 26 to allow a user to select an action to be taken. Alternatively, should the user wish an action to be taken automatically, suggestions 180 will instruct PowerConvert 24 to make the suggested changes.

Expert systems 170 is based upon facts, rules, actions and an inference engine. The number of facts grows each time the inference engine runs. The rules define what should be done with the facts. The facts are a collection of known elements such as: machines, metric data and thresholds. Examples of metric data would include: processor utilization by CPU, memory usage, disk space, bytes read and written from and to a disk, and bytes sent and received for a NIC. Thresholds are values above or below which a decision can be made.

The rules define how the system acts based upon the facts. For example:

a) If a CPU is overused, then add virtual machines on a server to handle the overuse.

b) If a server has room then convert from a physical to a virtual machine, i.e. does it have enough disk space, memory, CPU and NIC to support the physical machine as a virtual machine.

c) If the number of virtual machines on a server is beyond a threshold (e.g. four) then no new virtual machines may be added.

Rule c) is an example of a constraint, which may restrict movements of machines or images. The inference engine processes the rules based upon: preferences (e.g. CPU usage should be maximized), constraints (e.g. maximum of 4 VM machines on a server), cost and priorities and parameter assignment. Parameter assignment appends a new value to the facts.

Expert systems 170 conducts an analysis within the context of a set of machines (the machine pool), comprised of source and target candidates. The logic as follows describes a rule referred to as “the zone”. The zone is the middle part of the low and high thresholds a user may define for a machine's usage. For example if the low threshold is 30, and the high threshold is 70, the user has identified that they would like performance for a component such as a CPU or disk to be within the zone between 30 and 70. This can be defined per machine, per component.

a) Target machines may be any container and empty physical machines.

b) The heuristic for this rule is straightforward; for every metric by which the performance of a machine is evaluated, there is a desired range. The goal is to have all physical machines within the zone.

c) Within the desired range is a “sweet spot” used to rank machines and prioritize alternate solution scenarios.

d) If a physical machine in the pool is in the zone, it shall not be moved.

e) If a virtual machine in the pool is in the zone, it may be converted to another virtual server to enable more efficient use of virtual machines on virtual servers.

f) If all the target candidates in the machine pool are in the zone, no conversions are possible.

g) If all source machines are in the zone, no conversions are suggested.

h) Physical machines that are below the range should be moved to a virtual machine server, thereby freeing up the physical machine. P2V conversions are prioritized in such a way as to maximize the number of virtual machines on a virtual machine server.

i) In the absence of P2V candidates, virtual machines are migrated between underutilized virtual machine servers. The priority is to have the virtual machine server with the least powerful hardware in the zone first. The rationale is that by leaving the most powerful available, there will be greater flexibility for future analyses.

j) Physical servers that are above the zone should be left where they are as there is not a solution. If more powerful hardware exists for a conversion target then a possible solution is P2V or P2P conversion.

For decision making, performance statistics for CPU usage will be averaged over all instances. For example, percent usage will be averaged over both processors in a dual CPU server. NIC usage will be considered per NIC, since NICs can be on separate networks. Disk usage is considered on an individual basis. To get disk usage in the zone may require a conversion with disk resizing. Memory is targeted for the zone based on peak usage.
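The zone rule and the aggregation choices just described can be sketched as below. This is an illustrative Python fragment, not the inference engine itself; the function names are hypothetical, and treating the thresholds as inclusive bounds is an assumption.

```python
from statistics import mean


def classify(value, low, high):
    """Place a usage figure relative to the zone bounded by the low and high thresholds."""
    if value < low:
        return "below"
    if value > high:
        return "above"
    return "in zone"   # assumption: the thresholds themselves count as inside the zone


def cpu_usage(percent_per_cpu):
    """CPU usage is averaged over all processor instances."""
    return mean(percent_per_cpu)


def memory_usage(percent_samples):
    """Memory is judged on peak usage."""
    return max(percent_samples)


# A dual CPU server at 20% and 30% averages 25%, which is below a 30 to 70 zone.
cpu_state = classify(cpu_usage([20, 30]), 30, 70)

# Memory samples peaking at 65% fall inside the same zone.
memory_state = classify(memory_usage([40, 65, 50]), 30, 70)
```

With these inputs cpu_state is "below", making the machine a candidate source under the rules above, while memory_state is "in zone".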

An example algorithm to select machines for conversion would include the steps:

1) Loop once through the machine pool to select candidate target machines and candidate source machines.

2) If the list of target candidates is empty, return.

3) If the list of source candidates is empty, return.

Based upon the candidates selected we propose two solutions. The firstsolution is a best fit packing of virtual machines into virtual machineservers. In the list of source candidates include the virtual machinesalready on virtual machine servers. Then loop through all permutationsof source and target machines to find the best layout. The secondsolution is a variation on the first solution. Virtual machines are lefton the virtual machine servers where they currently reside if the serveris below the zone. Then search permutations of the remaining source andtarget machines to find the best layout.

The above solutions can be implemented simply by defining a set of comparison operators and methods, namely:

a) Vmserver.CanFit(machine a): True or False depending on whether a virtual machine server has enough of the correct resources to support “machine a” as a virtual machine.

b) Vmserver.NotFull( ): True if the virtual machine server can host more virtual machines.

c) machine.BiggerThan(machine a): True or False depending on whether “machine a” has a bigger footprint than the current machine. A footprint is the collective usage of all components on a machine. It is an n-dimensional representation of the machine's usage of physical hardware and performance data.

d) machine.LessThan(machine a): True or False depending on whether “machine a” has a smaller footprint than the current machine.
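A minimal sketch of these four operations is given below, with Python-style names standing in for CanFit, NotFull, BiggerThan and LessThan. Several details are assumptions made for illustration: the three resource dimensions, the reduction of the n-dimensional footprint to a simple sum, and the `max_vms` guest limit used to decide whether a server is full. The direction of the two comparisons follows the wording of items c) and d).

```python
from dataclasses import dataclass, field

RESOURCES = ("cpu", "memory", "disk")  # assumed footprint dimensions


@dataclass
class Machine:
    name: str
    usage: dict  # resource name -> amount used, in arbitrary units

    def footprint(self):
        # Assumed scalar summary of the n-dimensional footprint.
        return sum(self.usage[r] for r in RESOURCES)

    def bigger_than(self, a):
        # As worded in c): True when "machine a" has the bigger footprint.
        return a.footprint() > self.footprint()

    def less_than(self, a):
        # As worded in d): True when "machine a" has the smaller footprint.
        return a.footprint() < self.footprint()


@dataclass
class VmServer:
    name: str
    capacity: dict  # resource name -> total amount available
    max_vms: int    # hypothetical limit on the number of guests
    guests: list = field(default_factory=list)

    def free(self, resource):
        used = sum(g.usage[resource] for g in self.guests)
        return self.capacity[resource] - used

    def not_full(self):
        return len(self.guests) < self.max_vms

    def can_fit(self, a):
        # Every resource dimension must have room for "machine a".
        return self.not_full() and all(
            a.usage[r] <= self.free(r) for r in RESOURCES
        )
```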

Bin packing 172 attempts to optimally assign different types of resources, for example, disk space, memory, or CPU usage to a specific machine to determine if a fit can be made for transferring machines.

The general problem is figuring out the best way to pack objects into containers. The problem belongs to the family of NP-complete problems known as multi-dimensional bin packing and the multi-dimensional knapsack problem. A number of algorithms may be pursued to solve this problem; some examples are:

a) First Fit (FF). In this scenario, objects arrive in an unsorted list and are packed into the first bin they fit in.

b) First Fit Decreasing (FFD). In this scenario, objects are sorted in decreasing order and then FF is applied.

c) Best Fit (BF). In this scenario, objects arrive in an unsorted list and are packed into the bin that would leave the least amount of space.

d) Best Fit Decreasing (BFD). In this scenario, the objects are ordered in decreasing order and BF is applied.

e) Next Fit (NF). In this scenario, objects arrive in an unsorted list and are packed into the first bin they fit into. Subsequent objects are packed in the bin starting after the bin where the last object was packed.

f) Next Fit Decreasing (NFD). In this scenario, objects are sorted in decreasing order and NF is applied.

g) Worst Fit (WF). In this scenario, the objects are unsorted and packed in order into the bin that would leave the most amount of space.

h) Worst Fit Decreasing (WFD). In this scenario, the objects are sorted in decreasing order and WF is applied.

i) Permutation Pack (PP). In this scenario, the objects are ordered in increasing order. An object is packed into a container where the object has an inverse resource distribution. For example, where the object has memory usage > disk usage > CPU usage, find a container where memory usage < disk usage < CPU usage.

j) Custom 1 (C1). In this scenario, the objects are ordered in increasing order and the containers are ordered in increasing order. Then the objects are packed in a round robin manner.

k) Custom 2 (C2). In this scenario, the objects are ordered in increasing order and the containers are ordered in increasing order. Objects are packed into a container based upon the mean value of their size. For example, if a container is to hold four items, find four objects whose size is roughly the mean divided by four. Find the mean/4 object in the sorted list and take the subsequent items from either side of it.
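The First Fit, Best Fit and Worst Fit rules, together with their Decreasing variants, can be sketched with one routine. The sketch makes several assumptions that are not part of the description above: objects and bins are tuples with one entry per resource, objects are ranked by the sum of their entries, and the space left in a bin is measured as the sum of its remaining entries. Next Fit, Permutation Pack and the two custom strategies are not covered.

```python
def fits(item, free):
    """True if the item fits in every resource dimension."""
    return all(i <= f for i, f in zip(item, free))


def slack(item, free):
    """Total space left in a bin if the item were placed there."""
    return sum(f - i for i, f in zip(item, free))


def pack(items, bins, rule="first", decreasing=False):
    """Assign each item to a bin index, or None if it fits nowhere.

    items: list of tuples, one entry per resource
    bins:  list of capacity tuples with the same dimensions
    rule:  "first" (FF), "best" (BF) or "worst" (WF)
    decreasing: sort items largest first (FFD, BFD, WFD)
    """
    free = [list(b) for b in bins]
    order = list(range(len(items)))
    if decreasing:
        order.sort(key=lambda k: sum(items[k]), reverse=True)
    placement = [None] * len(items)
    for k in order:
        candidates = [b for b in range(len(free)) if fits(items[k], free[b])]
        if not candidates:
            continue  # no bin can hold this item
        if rule == "first":
            chosen = candidates[0]
        elif rule == "best":
            chosen = min(candidates, key=lambda b: slack(items[k], free[b]))
        elif rule == "worst":
            chosen = max(candidates, key=lambda b: slack(items[k], free[b]))
        else:
            raise ValueError("unknown rule: " + rule)
        free[chosen] = [f - i for i, f in zip(items[k], free[chosen])]
        placement[k] = chosen
    return placement
```

With one-dimensional objects of size 6, 5 and 4 and bins of capacity 10 and 7, First Fit places the objects in bins 0, 1 and 0, whereas Best Fit places them in bins 1, 0 and 0, because the size 6 object leaves the least space in the smaller bin.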

Neural networks 174 is a learning model. A neural network contains nodes, which commonly have two inputs and an output. The output is either on or off. The outputs are provided again to other nodes in the network or to other networks. The weights of the connections between nodes change as the system learns from data passed through the system. As the network, or any other logic engine, makes suggestions on what to do, the user may make a decision on what is actually the best course of action from their perspective. This information would then be fed back to the neural network 174 to make it learn.
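The feedback loop described above can be illustrated with a single two-input node trained by the classic perceptron update rule. This is a sketch only; the text does not specify a training rule, and the meaning given to the inputs (1 when a resource is underused) and to the feedback (1 when the user accepts a suggestion) is hypothetical.

```python
class Node:
    """A two-input node whose output is either on (1) or off (0)."""

    def __init__(self, rate=1):
        self.weights = [0, 0]
        self.bias = 0
        self.rate = rate

    def output(self, x1, x2):
        total = self.weights[0] * x1 + self.weights[1] * x2 + self.bias
        return 1 if total > 0 else 0

    def feedback(self, x1, x2, accepted):
        # Perceptron rule: adjust the weights only when the suggestion
        # disagreed with the decision the user actually made.
        error = accepted - self.output(x1, x2)
        self.weights[0] += self.rate * error * x1
        self.weights[1] += self.rate * error * x2
        self.bias += self.rate * error


# Hypothetical history: the user accepts a conversion only when both the
# CPU (x1) and the memory (x2) of a machine are underused.
history = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)]
node = Node()
for _ in range(10):
    for x1, x2, accepted in history:
        node.feedback(x1, x2, accepted)
```

After these passes the node suggests a conversion only when both inputs are on, matching the decisions in the hypothetical history.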

Fuzzy logic 176 uses human readable rules to make decisions. Rather than using thresholds such as “memory usage > 70%”, fuzzy rules may be applied, such as: if memory is above average and memory is increasing then add more memory. In the use of fuzzy logic the rules are not precise but descriptive of the goal to be achieved. Membership functions drive the fuzzy logic engine. They describe the importance of each input as a set of fuzzy numbers. Based upon these fuzzy numbers from the membership functions, a concrete value is returned. Fuzzy logic 176 could be used to monitor and change resources on a server which are easily modified on a running server. For example, without stopping a virtual machine from running, memory and paging could be adjusted as needed. For certain virtual servers, virtual machines may be moved onto faster servers.
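The example rule can be sketched with two membership functions and a simple way of turning the rule strength into a concrete value. The ramp shape, the breakpoints (50% to 90% usage, 0 to 10 percentage points per hour), the use of the minimum as the fuzzy "and", and the 512 MB step are all assumptions chosen for illustration.

```python
def ramp(x, low, high):
    """Membership that rises linearly from 0 at `low` to 1 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def memory_to_add(usage_pct, trend_pct_per_hour, max_step_mb=512):
    """Rule: if memory is above average and memory is increasing,
    then add more memory. Returns the amount to add, in megabytes."""
    above_average = ramp(usage_pct, 50, 90)
    increasing = ramp(trend_pct_per_hour, 0, 10)
    strength = min(above_average, increasing)  # fuzzy "and"
    return strength * max_step_mb  # concrete value from the rule strength
```

A machine at 70% memory usage with a steep upward trend is only half "above average" under these assumed breakpoints, so the sketch returns half of the maximum step rather than switching abruptly at a fixed threshold.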

User options 178 allows the user to enter their own criteria for suggesting machine conversions. For example a user could enter values for fuzzy logic fields such as:

a) CPU_VERY_OVERUSED, CPU_OVERUSED, CPU_OPTIMAL, CPU_UNDERUSED, CPU_VERY_UNDERUSED, for each virtual and physical machine.

b) VM_SHARES_UNDERALLOCATED, VM_SHARES_OPTIMAL, VM_SHARES_OVERALLOCATED, for virtual machines, globally or for all virtual machines on a virtual server.

c) SHORT_TIME, SOME_TIME, LONG_TIME, to define time scales for collecting and assessing data globally.
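One way such user-entered values might be represented is sketched below for the CPU fields. The numeric values are invented for illustration, and the sketch treats each value as the lower bound of a crisp band; a fuzzy engine would more likely blend between neighbouring bands than switch at a boundary.

```python
# Hypothetical user-entered lower bounds, in percent CPU usage.
CPU_LEVELS = [
    ("CPU_VERY_UNDERUSED", 0),
    ("CPU_UNDERUSED", 15),
    ("CPU_OPTIMAL", 30),
    ("CPU_OVERUSED", 70),
    ("CPU_VERY_OVERUSED", 90),
]


def cpu_label(usage_pct, levels=CPU_LEVELS):
    """Return the last label whose lower bound the usage has reached."""
    label = levels[0][0]
    for name, lower_bound in levels:
        if usage_pct >= lower_bound:
            label = name
    return label
```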

Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims. By way of example note that the inventors refer to the use of Windows and Linux environments, specific VM products and specific tools such as WinPE and SSH. One skilled in the art will recognize that the present invention is structured to be portable across operating systems and easily adaptable to different computing environments and other virtual machine technology.

APPENDIX 1

Machine XML Sample

<?xml version="1.0" encoding="utf-8" ?>
<VMwareESXVirtualMachine xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://schemas.platespin.com/athens/ws/">
  <productVersion>Version5_1</productVersion>
  <productVersionAtCreation>Version5_1</productVersionAtCreation>
  <lastUpdateTime>2005-10-05T10:04:50.0619494-04:00</lastUpdateTime>
  <id>e6c2ef66-b9b9-4c63-bf16-6a57c66ace44</id>
  <manufacturer>VMware, Inc.</manufacturer>
  <model>VMware Virtual Platform</model>
  <smbiosUUID>0ed94d56-dcc2-83a3-404a-6602a5041d94</smbiosUUID>
  <serialNumber>VMware-56 4d d9 0e c2 dc a3 83-40 4a 66 02 a5 04 1d 94</serialNumber>
  <operatingSystem xsi:type="MicrosoftOperatingSystem">
    <productVersion>Version5_1</productVersion>
    <productVersionAtCreation>Version5_1</productVersionAtCreation>
    <type>Windows2000</type>
    <hostName>vm3.platespin.com</hostName>
    <address>192.168.80.102</address>
    <version>5.0.2195</version>
    <networkConnections>
      <networkConnection>
        <name>Local Area Connection</name>
        <ipAddresses>
          <ipAddress>
            <address>192.168.80.102</address>
            <subnetMask>255.255.255.0</subnetMask>
          </ipAddress>
        </ipAddresses>
        <macAddress>00-0C-29-04-1D-94</macAddress>
        <dhcpEnabled>false</dhcpEnabled>
        <DefaultGateways>192.168.80.1</DefaultGateways>
        <DnsServers>192.168.220.10</DnsServers>
        <DnsServers>10.10.220.10</DnsServers>
        <WinsServers>192.168.50.66</WinsServers>
        <WinsServers>127.0.0.0</WinsServers>
        <peerDns>false</peerDns>
      </networkConnection>
    </networkConnections>
    <volumes>
      <volume>
        <fileSystem>NTFS</fileSystem>
        <size>3133796352</size>
        <freeSpace>1624932352</freeSpace>
        <serialNumber>8062f759</serialNumber>
        <label />
        <mountPoints>C:</mountPoints>
        <partitions>\disk0\partition0\</partitions>
        <isDynamic>false</isDynamic>
        <isCompressed>false</isCompressed>
      </volume>
      <volume>
        <fileSystem>NTFS</fileSystem>
        <size>535805440</size>
        <freeSpace>530402816</freeSpace>
        <serialNumber>581d5335</serialNumber>
        <label>Compressed</label>
        <mountPoints>D:</mountPoints>
        <partitions>\disk1\partition0\</partitions>
        <isDynamic>false</isDynamic>
        <isCompressed>true</isCompressed>
      </volume>
    </volumes>
    <installedPrograms>
      <installedProgram>
        <displayName>WebFldrs</displayName>
        <size>2556</size>
        <version>9.50.7522</version>
        <category>SystemComponent</category>
      </installedProgram>
    </installedPrograms>
    <acpiSupported>true</acpiSupported>
    <domain>PSACME</domain>
    <servicePack>4.0</servicePack>
    <windowsDirectory>C:\winnt</windowsDirectory>
    <description />
    <pageFiles>
      <pageFile>
        <location>C:\pagefile.sys</location>
        <maxSize>805306368</maxSize>
      </pageFile>
    </pageFiles>
    <hardwareProfile>1</hardwareProfile>
    <languageType>English</languageType>
    <systemFileList>
      <systemFileInfo>
        <fileName>ntoskrnl.exe</fileName>
        <internalName>ntoskrnl.exe</internalName>
        <companyName>Microsoft Corporation</companyName>
        <productVersion>5.00.2195.6717</productVersion>
        <languageType>English</languageType>
      </systemFileInfo>
      <systemFileInfo>
        <fileName>ntkrnlpa.exe</fileName>
        <internalName>ntkrnlpa.exe</internalName>
        <companyName>Microsoft Corporation</companyName>
        <productVersion>5.00.2195.6717</productVersion>
        <languageType>English</languageType>
      </systemFileInfo>
      <systemFileInfo>
        <fileName>hal.dll</fileName>
        <internalName>halaacpi.dll</internalName>
        <companyName>Microsoft Corporation</companyName>
        <productVersion>5.00.2195.6691</productVersion>
        <languageType>English</languageType>
      </systemFileInfo>
    </systemFileList>
    <windowsServices>
      <windowsService>
        <name>Abiosdsk</name>
        <description />
        <displayName>Abiosdsk</displayName>
        <status>Stopped</status>
        <startMode>Disabled</startMode>
        <type>KernelDriver</type>
        <pathToExecutable />
      </windowsService>
      <windowsService>
        <name>abp480n5</name>
        <description />
        <displayName>abp480n5</displayName>
        <status>Stopped</status>
        <startMode>Disabled</startMode>
        <type>KernelDriver</type>
        <pathToExecutable />
      </windowsService>
      <!-- removed some for clarity -->
    </windowsServices>
    <controlSet>1</controlSet>
  </operatingSystem>
  <memory>268435456</memory>
  <status>Running</status>
  <components>
    <component xsi:type="NetworkAdapter">
      <manufacturer>Advanced Micro Devices (AMD)</manufacturer>
      <model>AMD PCNET Family PCI Ethernet Adapter</model>
      <deviceId>0</deviceId>
      <pnpId>PCI\VEN_1022&DEV_2000&SUBSYS_20001022&REV_10\3&61AAA01&0&88</pnpId>
      <macAddress>00-0C-29-04-1D-94</macAddress>
    </component>
    <component xsi:type="DiskDrive">
      <manufacturer>(Standard disk drives)</manufacturer>
      <model>VMware Virtual disk SCSI Disk Device</model>
      <deviceId>\\.\PHYSICALDRIVE0</deviceId>
      <pnpId>SCSI\DISK&VEN_VMWARE&PROD_VIRTUAL_DISK&REV_1.0\4&5FCAAFC&0&000</pnpId>
      <size>3142056960</size>
      <type>SCSI</type>
      <partitions>
        <partition>
          <name>\disk0\partition0\</name>
          <size>3133799424</size>
          <startingOffset>32256</startingOffset>
          <active>true</active>
          <partitionType>7</partitionType>
          <primary>true</primary>
        </partition>
      </partitions>
    </component>
    <component xsi:type="DiskDrive">
      <manufacturer>(Standard disk drives)</manufacturer>
      <model>VMware Virtual disk SCSI Disk Device</model>
      <deviceId>\\.\PHYSICALDRIVE1</deviceId>
      <pnpId>SCSI\DISK&VEN_VMWARE&PROD_VIRTUAL_DISK&REV_1.0\4&5FCAAFC&0&010</pnpId>
      <size>536870912</size>
      <type>SCSI</type>
      <partitions>
        <partition>
          <name>\disk1\partition0\</name>
          <size>535805952</size>
          <startingOffset>16384</startingOffset>
          <active>false</active>
          <partitionType>7</partitionType>
          <primary>true</primary>
        </partition>
      </partitions>
    </component>
    <component xsi:type="Processor">
      <manufacturer>GenuineIntel</manufacturer>
      <model>Intel(R) Xeon(TM) CPU 3.06GHz</model>
      <deviceId>CPU0</deviceId>
      <speed>3059</speed>
    </component>
    <component xsi:type="ScsiRaidController">
      <manufacturer>BusLogic</manufacturer>
      <model>BusLogic MultiMaster PCI SCSI Host Adapter</model>
      <deviceId>PCI\VEN_104B&DEV_1040&SUBSYS_1040104B&REV_01\3&61AAA01&0&80</deviceId>
      <pnpId>PCI\VEN_104B&DEV_1040&SUBSYS_1040104B&REV_01\3&61AAA01&0&80</pnpId>
      <driverName>buslogic</driverName>
    </component>
  </components>
  <role>None</role>
  <PlateSpinDiscovered>true</PlateSpinDiscovered>
  <operatingSystemType>Windows2000</operatingSystemType>
  <numberOfCpus>1</numberOfCpus>
  <cpuMin>1</cpuMin>
  <cpuMax>1</cpuMax>
</VMwareESXVirtualMachine>

1. A system for remotely monitoring usage of machines in a data center and suggesting conversions between machines to make efficient use of resources in said data center, said system comprising: a) a data collection engine; and b) an optimization engine operatively coupled to said data collection engine.
2. The system of claim 1 wherein said system is operatively coupled to a machine conversion engine for the purpose of executing a conversion remotely and automatically.
3. The system of claim 1 further comprising a machine conversion engine and a job management engine operatively coupled to each of said data collection engine and said optimization engine.
4. The system of claim 1 wherein said data collection engine comprises means for collecting data and storing it in a database and means for collecting data in real time and presenting it to a user via a graphical interface.
5. The system of claim 1 wherein said data collection engine comprises a web services interface, said interface comprising an inventory module, a performance module and an analysis module.
6. The system of claim 5 wherein said inventory module comprises means for collecting information on the configuration of said machines and storing the same in a database.
7. The system of claim 5 wherein said performance module comprises means for extracting performance information from a database of information collected by said data collection engine and means for providing it to a requester.
8. The system of claim 5 wherein said analysis module comprises means for examining possible conversions between machines and means for determining the viability of a conversion.
9. The optimization engine of claim 1 wherein said optimization engine utilizes one or more analysis modules to provide suggestions on converting machines.
10. The optimization engine of claim 9 wherein said analysis module is an expert system.
11. The optimization engine of claim 9 wherein said analysis module comprises means for utilizing bin packing.
12. The optimization engine of claim 9 wherein said analysis module is a neural network.
13. The optimization engine of claim 9 wherein said analysis module comprises means for utilizing fuzzy logic.
14. The optimization engine of claim 9 wherein said analysis module comprises means for accepting analysis parameters from a user.
15. A method for remotely monitoring usage of machines in a data center to make efficient use of resources in the data center, the method comprising the steps of: collecting performance and machine data; analyzing the data; and suggesting conversions between machines.
16. The method of claim 15 further comprising the step of executing a conversion remotely and automatically.
17. The method of claim 15 further comprising the step of storing said data in a database and presenting it to a user via a graphical interface.
18. The method of claim 15 further comprising the step of collecting said data in real time and presenting it to a user via a graphical interface.
19. The method of claim 15 wherein said analyzing utilizes an expert system.
20. The method of claim 15 wherein said analyzing utilizes bin packing.
21. The method of claim 15 wherein said analyzing utilizes a neural network.
22. The method of claim 15 wherein said analyzing utilizes fuzzy logic.
23. The method of claim 15 further comprising the step of accepting analysis parameters from a user.
24. A computer readable medium, said medium comprising instructions for remotely monitoring usage of machines in a data center to make efficient use of resources in the data center, said instructions implementing the steps of: collecting performance and machine data; analyzing the data; and suggesting conversions between machines.